Abstract. The "carries" when n random numbers are added base b form a Markov chain 
with an "amazing" transition matrix determined by Holte [24]. This same Markov chain 
occurs in following the number of descents or rising sequences when n cards are repeatedly 
riffle shuffled. We give generating and symmetric function proofs and determine the rate 
of convergence of this Markov chain to stationarity. Similar results are given for type 
B shuffles. We also develop connections with Gaussian autoregressive processes and the 
Veronese mapping of commutative algebra. 



1. Introduction 

We use generating functions and symmetric function theory to explain a surprising
coincidence: when n long integers are added base b, the distribution of "carries" is the same as
the distribution of descents when n cards are repeatedly riffle shuffled. The explanation
yields a sharp analysis of convergence to stationarity of the associated Markov chains. A
similar analysis goes through for "type B" shuffles. In this introduction, we first explain
the carries process, then riffle shuffling and finally the connection.

1.1. Carries. Consider adding three 50-digit numbers base 10 (in the top row, italics are 
used to indicate the carries): 

1 12021 01111 11111 11111 11011 10111 01111 11111 21011 1112 

43935 23749 58561 74916 62215 47448 33196 51990 19807 27075 

48537 53642 77448 32760 14421 72142 82116 37225 43300 51498 

33618 41327 41561 16257 43616 55134 82714 63369 87142 45607 

1 26091 18719 77571 23934 20253 74725 98027 52585 50250 24180 



For this example, 6/50 = 12% of the columns have a carry of zero, 40/50 = 80% have a 
carry of one and 4/50 = 8% have a carry of two. 
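
The column-by-column bookkeeping behind such a table can be sketched in a few lines. This is a minimal illustration, not code from the paper: the function name `carries` and its `width` parameter are hypothetical, and the summands are passed as ordinary non-negative integers.

```python
from collections import Counter


def carries(numbers, base=10, width=None):
    """Return [kappa_0, kappa_1, ..., kappa_width] for column-by-column addition.

    kappa_0 = 0, and kappa_t is the carry produced by the t-th column,
    counting columns from the least significant digit.
    """
    rest = list(numbers)
    if width is None:
        # Number of digit columns of the largest summand.
        width = 1
        while base ** width <= max(rest):
            width += 1
    kappa = [0]  # nothing is carried into the units column
    for _ in range(width):
        column_total = kappa[-1] + sum(x % base for x in rest)
        kappa.append(column_total // base)
        rest = [x // base for x in rest]
    return kappa


def carry_frequencies(numbers, base=10, width=None):
    """Tally how often each carry value occurs (kappa_0 is excluded)."""
    return Counter(carries(numbers, base, width)[1:])
```

Calling `carry_frequencies` on three long random integers gives the kind of tally quoted above (how many columns produce a carry of zero, one or two).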

If n integers (base b) are produced by choosing their digits uniformly at random in
$\{0, 1, 2, \ldots, b-1\}$, the sequence of carries $\kappa_0 = 0, \kappa_1, \kappa_2, \ldots$ forms a Markov chain taking
values in $\{0, 1, 2, \ldots, n-1\}$. Holte [24] studied this Markov chain and found fascinating
structure in its "amazing" transition matrix $(P(i,j))$. Here $P(i,j)$ is the chance that the
next carry is j given that the last carry was i, and he showed, for $0 \le i, j \le n-1$, that

$$P(i,j) = \frac{1}{b^n} \sum_{l=0}^{j - \lfloor i/b \rfloor} (-1)^l \binom{n+1}{l} \binom{n-1-i+(j+1-l)b}{n}. \tag{1.1}$$

For example, when n = 3 the matrix becomes

$$\frac{1}{6b^2} \begin{pmatrix} b^2+3b+2 & 4b^2-4 & b^2-3b+2 \\ b^2-1 & 4b^2+2 & b^2-1 \\ b^2-3b+2 & 4b^2-4 & b^2+3b+2 \end{pmatrix}.$$
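
As a sanity check on formula (1.1), one can compare it with a direct enumeration of all digit columns for small n and b. The sketch below does this with exact fractions; the function names are illustrative and not taken from Holte's paper.

```python
from fractions import Fraction
from itertools import product
from math import comb


def holte_matrix(n, b):
    """Carries transition matrix computed from formula (1.1)."""
    P = [[Fraction(0)] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            total = 0
            # The sum runs over l = 0, ..., j - floor(i/b); it is empty if that is negative.
            for l in range(j - i // b + 1):
                total += (-1) ** l * comb(n + 1, l) * comb(n - 1 - i + (j + 1 - l) * b, n)
            P[i][j] = Fraction(total, b ** n)
    return P


def brute_force_matrix(n, b):
    """Same matrix, obtained by enumerating all b**n columns of n uniform digits."""
    P = [[Fraction(0)] * n for _ in range(n)]
    for i in range(n):
        for digits in product(range(b), repeat=n):
            j = (i + sum(digits)) // b  # next carry when the incoming carry is i
            P[i][j] += Fraction(1, b ** n)
    return P
```

For n = 3 and b = 10 the first row of either matrix is (132, 396, 72)/600, in agreement with the displayed 3 x 3 matrix.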

Among many other things, Holte shows that the jth entry of the left eigenvector with
eigenvalue 1 is $A(n,j)/n!$, with $A(n,j)$ the Eulerian number: the number of permutations
in the symmetric group $S_n$ with j descents. Here $\sigma \in S_n$ is said to have a descent at i
if $\sigma(i+1) < \sigma(i)$. So 5 1 3 2 4 has two descents. The fundamental theorem of Markov
chain theory gives that $A(n,j)/n!$ is the long term frequency of carries of j when long
random numbers are added. Note that this is independent of the base b. When n = 3,
$A(3,0)/6 = 1/6$, $A(3,1)/6 = 2/3$, $A(3,2)/6 = 1/6$, very roughly matching the example
above. We give alternative derivations of this at the end of Section 2.

We will not detail the many nice properties Holte found but warmly recommend his paper
[24]. Some further properties are in [9], [15], which give appearances of this same matrix
in card shuffling and in the Veronese construction for graded algebras. This is developed
briefly in Section 5 below.

1.2. Shuffling. The usual method of shuffling cards proceeds by cutting a deck of n cards 
into two approximately equal piles and then riffling the two piles together into one pile. A 
realistic mathematical model was created by Gilbert-Shannon-Reeds: cut off c cards with
probability $\binom{n}{c}/2^n$. Drop cards sequentially as follows: if the left pile has A cards and the
right pile has B cards, drop the next card from the bottom of the left pile with probability
$A/(A+B)$ and from the right pile with probability $B/(A+B)$. This is continued until all
cards are dropped.

A careful analysis of riffle shuffles is carried out in [4] using a generalization to b-shuffles.
There, a deck of cards is cut into b packets of sizes $c_1, c_2, \ldots, c_b$ with probability $\binom{n}{c_1, \ldots, c_b}/b^n$.
The packets are riffled together by dropping the next card with probability proportional
to packet size. Thus the original Gilbert-Shannon-Reeds model corresponds to a 2-shuffle.
Two basic facts established in [4] are:

• The chance of the permutation $\sigma$ arising after a b-shuffle is

$$\frac{\binom{n+b-d(\sigma^{-1})-1}{n}}{b^n} \tag{1.2}$$

with $d(\sigma^{-1})$ the number of descents in $\sigma^{-1}$.
• An a-shuffle followed by a b-shuffle is the same as an ab-shuffle.

Thus the result of r 2-shuffles is the same as a single $2^r$-shuffle and so formula (1.2) gives
a closed form expression for the chance of any permutation after r 2-shuffles. This and
some calculus allow a sharp analysis of the rate of convergence: roughly $\frac{3}{2}\log_2 n + c$ shuffles
suffice to make the distribution within $2^{-c}$ of the uniform distribution. Further details are
in [4].
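
Formula (1.2) can be checked by brute force for small decks. The sketch below relies on an assumption that is not spelled out in the text above: the standard "inverse" description of a b-shuffle, in which each card receives an independent uniform digit in {0, ..., b-1} and the deck is stably sorted by digit. Under that description the sorted word plays the role of $\sigma^{-1}$ in (1.2), so the number of digit strings producing a given word w should be $\binom{n+b-d(w)-1}{n}$. All function names are illustrative.

```python
from collections import Counter
from itertools import permutations, product
from math import comb


def descents(word):
    """Number of positions i with word[i] > word[i + 1]."""
    return sum(1 for x, y in zip(word, word[1:]) if x > y)


def inverse_shuffle_counts(n, b):
    """For each word, count the digit strings whose stable sort produces it."""
    counts = Counter()
    for digits in product(range(b), repeat=n):
        # sorted() is stable: positions with equal digits keep their relative order.
        word = tuple(sorted(range(n), key=lambda pos: digits[pos]))
        counts[word] += 1
    return counts


def matches_formula_1_2(n, b):
    """Check that every word w arises from exactly C(n + b - d(w) - 1, n) digit strings."""
    counts = inverse_shuffle_counts(n, b)
    return all(
        counts[w] == comb(n + b - descents(w) - 1, n)
        for w in permutations(range(n))
    )
```

Dividing the counts by $b^n$ gives the probabilities in (1.2); words with more than b - 1 descents are never produced, matching the vanishing binomial coefficient.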

The combinatorics of riffle shuffles has expanded. An enumerative theory of cycle and
other properties under the b-shuffle measure (1.2) is equivalent to the Gessel-Reutenauer
enumeration jointly by cycles and descents [18], [23]. The combinatorics of riffle shuffling
is essentially the same as quasi-symmetric function theory [12, HI]. There are extensions
to other types (see Section 4 below) and to random walk on the chambers of hyperplane
arrangements [a, H] and buildings [10]. Much of this development is surveyed in
Interesting new developments are in [jj.

1.3. The connection. Carries and riffle shuffling seem like different subjects. However, if
$P_b$ denotes the matrix (1.1), Holte [24] showed that

$$P_a P_b = P_{ab}. \tag{1.3}$$

The eigenvalues of the matrix $P_b$ turn out to be the same as the eigenvalues of the b-shuffle
transition matrix (the multiplicities are different). This, and the appearance of descents in
both subjects, led us to suspect and then prove an intimate connection. In Section 2 we
prove the following.

Theorem 1.1. The chance that the base-b carries chain goes from 0 to j in r steps is equal
to the chance that the permutation in $S_n$ obtained by performing r successive b-shuffles
(started at the identity) has j descents.
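
For small n the theorem can be verified exactly. The sketch below computes the law of the number of descents after r successive b-shuffles from formula (1.2), using the fact that r such shuffles amount to one $b^r$-shuffle, and compares it with r steps of the carries chain built by direct enumeration. The helper names are illustrative only.

```python
from fractions import Fraction
from itertools import permutations, product
from math import comb


def descents(word):
    return sum(1 for x, y in zip(word, word[1:]) if x > y)


def inverse(perm):
    inv = [0] * len(perm)
    for pos, val in enumerate(perm):
        inv[val] = pos
    return tuple(inv)


def descent_law_after_shuffles(n, b, r):
    """Law of d(sigma) when sigma has the b**r-shuffle distribution (1.2)."""
    B = b ** r
    law = [Fraction(0)] * n
    for sigma in permutations(range(n)):
        prob = Fraction(comb(n + B - descents(inverse(sigma)) - 1, n), B ** n)
        law[descents(sigma)] += prob
    return law


def carries_matrix(n, b):
    """Carries transition matrix by enumeration of all digit columns."""
    P = [[Fraction(0)] * n for _ in range(n)]
    for i in range(n):
        for digits in product(range(b), repeat=n):
            P[i][(i + sum(digits)) // b] += Fraction(1, b ** n)
    return P


def carries_law(n, b, r):
    """Law of the carry after r steps of the base-b chain started at 0."""
    P = carries_matrix(n, b)
    row = [Fraction(int(j == 0)) for j in range(n)]
    for _ in range(r):
        row = [sum(row[i] * P[i][j] for i in range(n)) for j in range(n)]
    return row
```

For instance, with n = 3, b = 2 and r = 1 both functions return (1/2, 1/2, 0).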

We give a generating function proof which also yields a similar statement for the inverse
permutation along with enumerative results of Gessel in Section 2. We have subsequently
found a bijective proof of the theorem which shows that the transition matrices of carries
(1.1) and the Markov chain generated by the number of descents after successive b-shuffles
are the same [15].

The more analytic proof given here allows us to use the Robinson-Schensted-Knuth
(RSK) correspondence and symmetric function theory to show that the number of descents
(and in fact any function of the descent set) after r 2-shuffles is close to stationarity when
$r = \log_2 n + c$. (Note from [4] that $\frac{3}{2}\log_2 n + c$ are required for all aspects of the permutation
to be close to stationarity.) The correspondence with carries shows that the carries chain
'settles down' after $\log_2 n + c$ steps. Refining this, we show that for large n, $\frac{1}{2}\log_b(n) + c$ steps of
the carries chain are necessary and sufficient for convergence to stationarity. Details are in
Section 3.

The discussion so far has all been on the permutation group. There are well-established
"type B" (hyperoctahedral) shuffles [1, d, Hlf. In Section 4 we develop a parallel "carries
process" and show that theorems about type B shuffles translate into theorems about adding
numbers. We also point out a connection with the theory of rounding. Section 5 shows that
for large n, the carries process is well approximated by a Gaussian autoregressive process,
and develops the connection with the Veronese mapping of commutative algebra.

2. Two Markov Chains 

In this section we show that two processes derived from the Markov chain of repeated
b-shuffles on the symmetric group are Markov chains with transition probabilities from i to
j the same as the carries chain. As background, note that usually a function of a Markov
chain is not a Markov chain. A simple example is nearest neighbor random walk on the
integers mod n, with n odd, $n \ge 7$. Let the walk start at 0 and move left or right with
probability 1/2. Let $f(j) = 1$ for $0 \le j \le (n-1)/2$, $f(j) = -1$ otherwise. If steps of the
original walk are denoted $X_0 = 0, X_1, X_2, \ldots$ and $Y_j = f(X_j)$, then $\{X_j\}_{j=0}^{\infty}$ is a Markov
chain but $\{Y_j\}_{j=0}^{\infty}$ is not: $P\{Y_3 = + \mid Y_2 = +\} = 2/3$, while $P\{Y_3 = + \mid Y_2 = +, Y_1 = +\} = 3/4$. The
literature on conditions for a function of a Markov chain to be Markov is often called "lumping of Markov chains." A
useful introduction is [26] with [1^ a sophisticated extension.
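
The failure of the Markov property in this example can be confirmed by enumerating the eight equally likely three-step paths of the walk. The sketch below is illustrative; the function names are not from the paper.

```python
from fractions import Fraction
from itertools import product


def sign_path_law(n, steps=3):
    """Exact law of (Y_1, ..., Y_steps) for the walk on the integers mod n started at 0."""
    upper = (n - 1) // 2  # f(j) = +1 exactly when 0 <= j <= (n - 1) / 2
    law = {}
    for moves in product((-1, 1), repeat=steps):
        x, signs = 0, []
        for m in moves:
            x = (x + m) % n
            signs.append('+' if x <= upper else '-')
        key = tuple(signs)
        law[key] = law.get(key, Fraction(0)) + Fraction(1, 2 ** steps)
    return law


def conditional(law, event, given):
    """P(event | given) for predicates acting on a sign path (Y_1, Y_2, Y_3)."""
    num = sum(p for path, p in law.items() if given(path) and event(path))
    den = sum(p for path, p in law.items() if given(path))
    return num / den


law = sign_path_law(7)
p_given_y2 = conditional(law, lambda y: y[2] == '+', lambda y: y[1] == '+')
p_given_y2_y1 = conditional(law, lambda y: y[2] == '+',
                            lambda y: y[1] == '+' and y[0] == '+')
```

The two conditional probabilities differ, so knowing $Y_1$ in addition to $Y_2$ changes the prediction of $Y_3$.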

To begin, we show that the two basic facts about riffle shuffles give a generating function 
identity of Gessel (unpublished). 

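Proposition 2.1. Let $\sigma$ be a permutation with d descents. Let $c^d_{ij}$ be the number of ordered
pairs $(\tau, \mu)$ of permutations in $S_n$ such that $\tau$ has i descents, $\mu$ has j descents, and $\tau\mu = \sigma$.
Then

$$\sum_{i,j \ge 0} \frac{c^d_{ij}\, s^{i+1} t^{j+1}}{(1-s)^{n+1}(1-t)^{n+1}} = \sum_{a,b \ge 0} \binom{n+ab-d-1}{n} s^a t^b.$$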



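Proof. Since an a-shuffle followed by a b-shuffle is an ab-shuffle, the formula (1.2) implies
that

$$\left[\sum_{\mu \in S_n} \binom{n+a-d(\mu)-1}{n} \mu^{-1}\right] \left[\sum_{\tau \in S_n} \binom{n+b-d(\tau)-1}{n} \tau^{-1}\right] = \sum_{\sigma \in S_n} \binom{n+ab-d(\sigma)-1}{n} \sigma^{-1}.$$

Multiplying both sides by $s^a t^b$, summing over all $a, b \ge 0$, and then taking the coefficient of
$\sigma^{-1}$ on both sides yields that

$$\begin{aligned}
\sum_{a,b \ge 0} \binom{n+ab-d-1}{n} s^a t^b
&= \sum_{\tau\mu = \sigma} \left[\sum_{a \ge 0} s^a \binom{n+a-d(\mu)-1}{n}\right] \left[\sum_{b \ge 0} t^b \binom{n+b-d(\tau)-1}{n}\right] \\
&= \sum_{\tau\mu = \sigma} \frac{s^{d(\mu)+1}\, t^{d(\tau)+1}}{(1-s)^{n+1}(1-t)^{n+1}} \\
&= \sum_{i,j \ge 0} \frac{c^d_{ij}\, s^{j+1} t^{i+1}}{(1-s)^{n+1}(1-t)^{n+1}}.
\end{aligned}$$

Since the left-hand side is symmetric in s and t, so is the right-hand side, and interchanging
s and t gives the stated identity. □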



Recall that if a Markov chain has transition probabilities $P(i,j)$, its formal time reversal
with respect to a stationary measure $\pi$ is defined to have transition probabilities $P^*(i,j) =
\pi(j)P(j,i)/\pi(i)$. This $P^*$ is a Markov transition matrix which also has $\pi$ as stationary measure.
A Markov chain P is reversible with respect to $\pi$ if and only if $P = P^*$.
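
The definition can be exercised on the carries chain itself. The sketch below builds the carries matrix by enumeration, takes $\pi(j) = A(n,j)/n!$ with the Eulerian numbers computed from Euler's alternating-sum formula, forms $P^*$, and leaves it to the reader to inspect that $P^*$ is again stochastic with $\pi$ stationary. Function names are illustrative.

```python
from fractions import Fraction
from itertools import product
from math import comb, factorial


def carries_matrix(n, b):
    """Carries transition matrix by enumeration of all digit columns."""
    P = [[Fraction(0)] * n for _ in range(n)]
    for i in range(n):
        for digits in product(range(b), repeat=n):
            P[i][(i + sum(digits)) // b] += Fraction(1, b ** n)
    return P


def eulerian(n, j):
    """Number of permutations of n symbols with j descents (Euler's formula)."""
    return sum((-1) ** l * comb(n + 1, l) * (j + 1 - l) ** n for l in range(j + 1))


def stationary(n):
    return [Fraction(eulerian(n, j), factorial(n)) for j in range(n)]


def time_reversal(P, pi):
    """P*(i, j) = pi(j) P(j, i) / pi(i)."""
    size = len(P)
    return [[pi[j] * P[j][i] / pi[i] for j in range(size)] for i in range(size)]


def left_multiply(pi, P):
    size = len(P)
    return [sum(pi[i] * P[i][j] for i in range(size)) for j in range(size)]
```

With `P = carries_matrix(4, 3)` and `pi = stationary(4)`, both `left_multiply(pi, P)` and `left_multiply(pi, time_reversal(P, pi))` return `pi`.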

Theorem 2.2 identifies the carries Markov chain with the formal time reversal of a chain
arising in the theory of riffle shuffles. As in the introduction, $\pi$ denotes the distribution on
$\{0, 1, \ldots, n-1\}$ defined by $\pi(j) = A(n,j)/n!$, where $A(n,j)$ is the number of permutations in
$S_n$ with j descents.

Theorem 2.2. Let a Markov chain on the symmetric group $S_n$ begin at the identity and
proceed by successive independent b-shuffles. Then the number of descents of $\tau_r^{-1}$, the
inverse of the permutation obtained after r shuffles, forms a
Markov chain with stationary distribution $\pi(j) = A(n,j)/n!$, and its formal time reversal with
respect to $\pi$ is identical with the carries Markov chain.
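Proof. Let $d(\tau_r^{-1})$ denote the number of descents of the inverse of the permutation $\tau_r$
obtained after r independent b-shuffles. Corollary 2 of [1] showed that $d(\tau_r^{-1})$ forms a Markov
chain. Note that the stationary distribution of this chain is given by $\pi(j) = A(n,j)/n!$, since
$\tau_r$ tends to a uniform element of $S_n$ as $r \to \infty$.

We compute the transition probabilities of the Markov chain formed by $d(\tau_r^{-1})$. By (1.2),

$$P(d(\tau_{r-1}^{-1}) = i) = A(n,i)\, \frac{\binom{n+b^{r-1}-i-1}{n}}{b^{(r-1)n}}.$$

Clearly

$$P(d(\tau_{r-1}^{-1}) = i,\ d(\tau_r^{-1}) = j) = \sum_{\sigma:\, d(\sigma^{-1}) = i} \frac{\binom{n+b^{r-1}-i-1}{n}}{b^{(r-1)n}} \sum_{k \ge 0} \sum_{\substack{\mu:\, d(\mu^{-1}) = k \\ d(\sigma^{-1}\mu^{-1}) = j}} \frac{\binom{n+b-k-1}{n}}{b^n}.$$

Thus

$$P(d(\tau_r^{-1}) = j \mid d(\tau_{r-1}^{-1}) = i) = \frac{P(d(\tau_{r-1}^{-1}) = i,\ d(\tau_r^{-1}) = j)}{P(d(\tau_{r-1}^{-1}) = i)} = \frac{1}{A(n,i)} \sum_{\sigma:\, d(\sigma^{-1}) = i} \sum_{k \ge 0} \sum_{\substack{\mu:\, d(\mu^{-1}) = k \\ d(\sigma^{-1}\mu^{-1}) = j}} \frac{\binom{n+b-k-1}{n}}{b^n}.$$

In the notation of Proposition 2.1 this is

$$\frac{A(n,j)}{A(n,i)} \frac{1}{b^n} \sum_{k \ge 0} c^j_{ik} \binom{n+b-k-1}{n}.$$

Letting $[x^n]f(x)$ denote the coefficient of $x^n$ in a series $f(x)$, this can be rewritten as

$$[s^{i+1} t^b]\ \frac{A(n,j)}{A(n,i)} \frac{(1-s)^{n+1}}{b^n} \sum_{m,k \ge 0} \frac{c^j_{mk}\, s^{m+1} t^{k+1}}{(1-s)^{n+1}(1-t)^{n+1}}.$$

By Proposition 2.1 this is equal to

$$\begin{aligned}
& [s^{i+1} t^b]\ \frac{A(n,j)}{A(n,i)} \frac{(1-s)^{n+1}}{b^n} \sum_{a,d \ge 0} \binom{n+ad-j-1}{n} s^a t^d \\
&= [s^{i+1}]\ \frac{A(n,j)}{A(n,i)} \frac{(1-s)^{n+1}}{b^n} \sum_{a \ge 0} \binom{n+ab-j-1}{n} s^a \\
&= \frac{A(n,j)}{A(n,i)} \frac{1}{b^n} \sum_{l=0}^{i+1} (-1)^l \binom{n+1}{l} \binom{n-1-j+(i+1-l)b}{n}.
\end{aligned}$$

This is equal to $\pi(j)P(j,i)/\pi(i)$, where P is the transition probability of the carries chain (1.1). □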

The next result gives a second, more direct, interpretation of the transition probabilities 
of the carries chain. 
Theorem 2.3. The chance that the base-b carries chain goes from 0 to j in r steps is
equal to the chance that a permutation in $S_n$ obtained by performing r successive b-shuffles
(started at the identity) has j descents.



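Proof. By (1.2) and the fact that an a-shuffle followed by a b-shuffle is an ab-shuffle, the
chance that r successive b-shuffles (started at the identity) lead to a permutation with j
descents is

$$\sum_{i \ge 0} c^0_{ij}\, \frac{1}{b^{rn}} \binom{n+b^r-i-1}{n}, \tag{2.1}$$

where as in Proposition 2.1, $c^0_{ij}$ denotes the number of $\sigma \in S_n$ such that $d(\sigma^{-1}) = i$ and
$d(\sigma) = j$.

Proposition 2.1 gives that

$$\sum_{i,k \ge 0} \frac{c^0_{ik}\, s^{i+1} t^{k+1}}{(1-s)^{n+1}(1-t)^{n+1}} = \sum_{a,d \ge 0} \binom{n+ad-1}{n} s^a t^d.$$

Taking the coefficient of $s^{b^r}$ on both sides gives that

$$\sum_{i,k \ge 0} c^0_{ik} \binom{n+b^r-i-1}{n} \frac{t^{k+1}}{(1-t)^{n+1}} = \sum_{d \ge 0} \binom{n+b^r d-1}{n} t^d.$$

Comparing with equation (2.1) gives that the chance that a permutation obtained after r
successive b-shuffles has j descents is

$$\frac{1}{b^{rn}}\, [t^{j+1}]\, (1-t)^{n+1} \sum_{d \ge 0} \binom{n+b^r d-1}{n} t^d = \frac{1}{b^{rn}} \sum_{l=0}^{j} (-1)^l \binom{n+1}{l} \binom{n-1+(j+1-l)b^r}{n}.$$

From (1.1), this is equal to the carries transition probability $P_{b^r}(0,j)$. By equation (1.3),
this is $P_b^r(0,j)$, as claimed. □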

We conclude this section with two alternative derivations of the stationary distribution
of the carries chain. The following lemma will be helpful. Stanley [31] and Pitman [28] give
bijective proofs.

Lemma 2.4. Let $X_1, \ldots, X_n$ be independent uniform [0,1] random variables. Then for
all integers j, $P(j \le \sum_{i=1}^n X_i < j+1)$ is equal to the probability that a uniformly chosen
random permutation on n symbols has j descents.

As usual we let $P^r(0,j)$ denote the distribution on $\{0, 1, \ldots, n-1\}$ after r steps of the
carries chain (for the base b addition of n numbers) started from 0.

Theorem 2.5. ([24]) The stationary distribution $\pi$ of the carries chain satisfies $\pi(j) =
A(n,j)/n!$, where $A(n,j)$ is the number of permutations in $S_n$ with j descents.
Proof. By Holte [24], r steps of the base b carries chain is equivalent to one step of the base
$b^r$ carries chain. Letting $Y_1, \ldots, Y_n$ be discrete i.i.d. uniforms on $\{0, 1, \ldots, b^r-1\}$, it follows
that

$$P^r(0,j) = P\left(j b^r \le \sum_{i=1}^n Y_i < (j+1) b^r\right).$$

Letting $U_1, \ldots, U_n$ be continuous i.i.d. uniforms on $[0, b^r]$, this implies that

$$\begin{aligned}
P^r(0,j) &= P\left(j b^r \le \sum_{i=1}^n \lfloor U_i \rfloor < (j+1) b^r\right) \\
&= P\left(j b^r \le \sum_{i=1}^n U_i - \sum_{i=1}^n (U_i - \lfloor U_i \rfloor) < (j+1) b^r\right) \\
&= P\left(j \le \sum_{i=1}^n X_i - E < j+1\right).
\end{aligned}$$

Here the $X_i = U_i/b^r$ are i.i.d. uniforms on [0,1] and $E = \frac{1}{b^r} \sum_{i=1}^n (U_i - \lfloor U_i \rfloor)$.

Although E is not independent of the $X_i$'s, note that when n is fixed and $r \to \infty$, E
converges in probability to 0. Indeed, this follows since $0 \le E < n/b^r$ with probability 1. Thus
Slutsky's theorem implies that

$$\lim_{r \to \infty} P^r(0,j) = P\left(j \le \sum_{i=1}^n X_i < j+1\right),$$

and the result follows from Lemma 2.4. □



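A simple analytic way to find the stationary distribution uses the closed form for $P^r(0,j)$.
As r tends to infinity,

$$\frac{1}{b^{rn}} \binom{n-1+(j+1-l)b^r}{n} \to \frac{(j+1-l)^n}{n!}.$$

Thus by (1.1) and (1.3),

$$\lim_{r \to \infty} P^r(0,j) = \frac{1}{n!} \sum_{l=0}^{j} (-1)^l \binom{n+1}{l} (j+1-l)^n = \frac{A(n,j)}{n!}.$$

The last equality is an identity, due to Euler, for the $A(n,j)$ [1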
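
The limit can be watched numerically. The sketch below evaluates the closed form for $P^r(0,j)$ (formula (1.1) applied to base $b^r$, as permitted by (1.3)) with exact fractions and compares it with Euler's formula for $A(n,j)/n!$. The function names are illustrative.

```python
from fractions import Fraction
from math import comb, factorial


def r_step_from_zero(n, b, r, j):
    """P^r(0, j): formula (1.1) with i = 0 and base b**r."""
    B = b ** r
    total = sum(
        (-1) ** l * comb(n + 1, l) * comb(n - 1 + (j + 1 - l) * B, n)
        for l in range(j + 1)
    )
    return Fraction(total, B ** n)


def eulerian(n, j):
    """Euler's alternating sum for the number of permutations with j descents."""
    return sum((-1) ** l * comb(n + 1, l) * (j + 1 - l) ** n for l in range(j + 1))


def max_gap_to_stationarity(n, b, r):
    """Largest coordinate-wise gap between P^r(0, .) and A(n, .)/n!."""
    return max(
        abs(r_step_from_zero(n, b, r, j) - Fraction(eulerian(n, j), factorial(n)))
        for j in range(n)
    )
```

Evaluating `max_gap_to_stationarity(4, 2, r)` for increasing r shows the gap shrinking geometrically.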

3. Rates of Convergence 

This section presents both upper and lower bounds on convergence to stationarity for
the equivalent Markov chains of Section 2. Theorem 3.2 shows that the descent set of a
permutation (not just the number of descents) is close to its stationary distribution after
r b-shuffles if $r = \log_b(n) + c$. This uses symmetric function theory. Theorem 3.3 uses
stochastic monotonicity to bound convergence of the carries chain: it shows that at least
$r = \frac{1}{2}\log_b(n) + c$ steps are needed and that $r = \log_b(n) + c$ steps suffice. Theorem 3.4 shows
that for large n, $\frac{1}{2}\log_b(n) + c$ steps are sufficient.

All of our results involve the total variation distance between probability measures P and
Q on a finite set X, defined as

$$\|P - Q\|_{TV} = \frac{1}{2} \sum_{x \in X} |P(x) - Q(x)| = \max_{A \subseteq X} |P(A) - Q(A)|.$$

Theorem 3.1. Consider the carries chain for base b addition of n numbers. Let $r =
\lceil \log_b(cn) \rceil$ with c > 0. Let $P_0^r$ denote the distribution on $\{0, 1, \ldots, n-1\}$ given by taking r
steps in the carries chain, started from 0. Let $\pi$ be the stationary distribution of the carries
chain. Then

$$\|P_0^r - \pi\|_{TV} \le \frac{1}{2} \sqrt{e^{1/(2c^2)} - 1}.$$
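
To get a feel for the bound, the sketch below computes the exact total variation distance between $P_0^r$ and $\pi$ for a small case and evaluates the right-hand side of the theorem with $c = b^r/n$. It uses the closed form for $P^r(0,j)$ and Euler's formula for $A(n,j)$; the function names are illustrative.

```python
from fractions import Fraction
from math import comb, exp, factorial, sqrt


def carries_law_from_zero(n, b, r):
    """Exact law of the carry after r steps from 0 (formula (1.1) in base b**r)."""
    B = b ** r
    return [
        Fraction(
            sum((-1) ** l * comb(n + 1, l) * comb(n - 1 + (j + 1 - l) * B, n)
                for l in range(j + 1)),
            B ** n,
        )
        for j in range(n)
    ]


def stationary(n):
    """pi(j) = A(n, j) / n! via Euler's alternating sum."""
    return [
        Fraction(sum((-1) ** l * comb(n + 1, l) * (j + 1 - l) ** n for l in range(j + 1)),
                 factorial(n))
        for j in range(n)
    ]


def total_variation(p, q):
    """Half the L1 distance between two laws on the same finite set."""
    return sum(abs(x - y) for x, y in zip(p, q)) / 2


def theorem_3_1_bound(n, b, r):
    """Right-hand side of Theorem 3.1 with c chosen so that b**r = c * n."""
    c = b ** r / n
    return 0.5 * sqrt(exp(1 / (2 * c * c)) - 1)


def distances(n, b, max_r):
    pi = stationary(n)
    return [total_variation(carries_law_from_zero(n, b, r), pi) for r in range(1, max_r + 1)]
```

For example, `distances(5, 2, 8)` can be printed next to `theorem_3_1_bound(5, 2, r)` for r = 1, ..., 8 to compare the exact distances with the bound.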

In fact, we prove a stronger result. This uses the notion of the descent set of a permutation
$\sigma$, defined as the set of i, $1 \le i \le n-1$, such that $\sigma(i) > \sigma(i+1)$. For instance 5 1 3 2 4 has
descent set {1,3}. Let $P^r(S)$ denote the probability that a permutation obtained after the
iteration of r b-shuffles (or equivalently a single $b^r$-shuffle) has descent set S, and let $\pi(S)$
denote the chance that a uniformly chosen random permutation has descent set S. Theorem
3.2 uses symmetric function theory to upper bound the total variation distance between $P^r$
and $\pi$. Chapter 7 of the text [33] provides background on the concepts used in the proof of
Theorem 3.2 (i.e. Young tableaux, the RSK correspondence, and symmetric functions). In
0] it is shown that the descent set is a Markov chain when cards are repeatedly b-shuffled.



Theorem 3.2. Let $r = \lceil \log_b(cn) \rceil$ with c > 0. Then with the notation of Theorem 3.1,

$$\|P^r - \pi\|_{TV} \le \frac{1}{2} \sqrt{e^{1/(2c^2)} - 1}.$$

Proof. We use the RSK correspondence which associates to a permutation $\sigma$ a pair of
standard Young tableaux (P, Q) called the insertion and recording tableau of $\sigma$ respectively.
One says that a standard Young tableau T has a descent at i ($1 \le i \le n-1$) if i + 1 is in a
row lower than i in T. We let d(T) denote the number of descents of T. By Lemma 7.23.1
of [33], the descent set of $\sigma$ is equal to the descent set of $Q(\sigma)$. This implies that

$$\pi(S) = \sum_{|\lambda| = n} \frac{f_\lambda\, f_\lambda(S)}{n!},$$

where $f_\lambda$ is the number of standard Young tableaux of shape $\lambda$, and $f_\lambda(S)$ is the number
of standard Young tableaux of shape $\lambda$ with descent set S.

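From Theorem 3 of [12], the chance that $Q(\sigma) = T$ (for $\sigma$ obtained from a $b^r$-shuffle) is
$s_\lambda(\frac{1}{b^r}, \ldots, \frac{1}{b^r})$ for any standard Young tableau T of shape $\lambda$. Here $b^r$ coordinates of the Schur function
$s_\lambda$ are equal to $\frac{1}{b^r}$ and the rest are 0. Thus,

$$\begin{aligned}
\|P^r - \pi\|_{TV} &= \frac{1}{2} \sum_{S \subseteq \{1, \ldots, n-1\}} |P^r(S) - \pi(S)| \\
&= \frac{1}{2} \sum_{S} \left| \sum_{|\lambda| = n} f_\lambda(S)\, s_\lambda\!\left(\tfrac{1}{b^r}, \ldots, \tfrac{1}{b^r}\right) - \sum_{|\lambda| = n} \frac{f_\lambda(S) f_\lambda}{n!} \right| \\
&\le \frac{1}{2} \sum_{S} \sum_{|\lambda| = n} f_\lambda(S) \left| s_\lambda\!\left(\tfrac{1}{b^r}, \ldots, \tfrac{1}{b^r}\right) - \frac{f_\lambda}{n!} \right| \\
&= \frac{1}{2} \sum_{|\lambda| = n} f_\lambda \left| s_\lambda\!\left(\tfrac{1}{b^r}, \ldots, \tfrac{1}{b^r}\right) - \frac{f_\lambda}{n!} \right|.
\end{aligned}$$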



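By the Cauchy-Schwarz inequality, this is at most

$$\frac{1}{2} \left[ n! \sum_{|\lambda| = n} \left( s_\lambda\!\left(\tfrac{1}{b^r}, \ldots, \tfrac{1}{b^r}\right) - \frac{f_\lambda}{n!} \right)^2 \right]^{1/2}.$$

The functions $f_\lambda s_\lambda(\frac{1}{b^r}, \ldots, \frac{1}{b^r})$ and $\frac{f_\lambda^2}{n!}$ both define probability measures on the set of
partitions of size n; the first is the distribution on RSK shapes after a riffle shuffle [s^], and
the second is known as Plancherel measure. Hence the previous expression simplifies to

$$\frac{1}{2} \left[ n! \sum_{|\lambda| = n} s_\lambda\!\left(\tfrac{1}{b^r}, \ldots, \tfrac{1}{b^r}\right)^2 - 1 \right]^{1/2}.$$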

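Let $[u^n]f(u)$ denote the coefficient of $u^n$ in a series $f(u)$. By the Cauchy identity for Schur
functions [33, p. 322],

$$\sum_{|\lambda| = n} s_\lambda\!\left(\tfrac{1}{b^r}, \ldots, \tfrac{1}{b^r}\right)^2 = [u^n] \sum_{|\lambda| \ge 0} s_\lambda\!\left(\tfrac{1}{b^r}, \ldots, \tfrac{1}{b^r}\right)^2 u^{|\lambda|} = [u^n] \left(1 - \frac{u}{b^{2r}}\right)^{-b^{2r}} = \frac{1}{n!} \prod_{i=1}^{n-1} \left(1 + \frac{i}{b^{2r}}\right).$$

Thus

$$n! \sum_{|\lambda| = n} s_\lambda\!\left(\tfrac{1}{b^r}, \ldots, \tfrac{1}{b^r}\right)^2 - 1 = \prod_{i=1}^{n-1} \left(1 + \frac{i}{b^{2r}}\right) - 1.$$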
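Since $\log(1 + x) \le x$ for x > 0, it follows that

$$\log \prod_{i=1}^{n-1} \left(1 + \frac{i}{b^{2r}}\right) \le \sum_{i=1}^{n-1} \frac{i}{b^{2r}} = \frac{\binom{n}{2}}{b^{2r}}.$$

Thus

$$\prod_{i=1}^{n-1} \left(1 + \frac{i}{b^{2r}}\right) - 1 \le \exp\left(\frac{\binom{n}{2}}{b^{2r}}\right) - 1.$$

Summarizing, it has been shown that

$$\|P^r - \pi\|_{TV} \le \frac{1}{2} \sqrt{\exp\left(\frac{\binom{n}{2}}{b^{2r}}\right) - 1}.$$

If $b^r \ge cn$ with c > 0, then $\binom{n}{2}/b^{2r} \le \frac{1}{2c^2}$, which proves the result. □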



Proof of Theorem 3.1. Theorem 2.3 showed that the base-b carries chain started from 0 is
the same as the chain for the number of descents after successive b-shuffles started from
the identity. Thus Theorem 3.2 also upper bounds the total variation distance between r
iterations of the base-b carries chain (started from 0) and its stationary distribution. □

Next we give a different approach to proving convergence using stochastic monotonicity
and also give a lower bound. The arguments show that $\log_b n + c$ steps suffice for convergence
and that $\frac{1}{2}\log_b n$ steps are not enough.

Theorem 3.3. For $n \ge 3$, any starting state i, and any $r \ge 0$, the Markov chain P of
(1.1) satisfies

' n — 1 



\P'^{h-)-M\TV< 



Conversely, for any $\epsilon$, $0 < \epsilon < 1$, if $1 \le r \le \log_b$



+ 1 



then 



$$\|P^r(i, \cdot) - \pi\|_{TV} \ge 1 - \epsilon.$$



Proof. Recall that a Markov chain on $\{0, 1, \ldots, n-1\}$ is stochastically monotone if for
all $i \le i'$, $P(i, \{0, \ldots, j\}) \ge P(i', \{0, \ldots, j\})$ for all j. We show that P is stochastically
monotone by coupling. Consider two copies of the carries chain, one at i and one at i' with
$i \le i'$. Each chain proceeds by adding n random base-b digits. Couple them by adding the
same digits to both. If the first process results in a carry of k, the second process results
in a carry of k or k + 1. This implies stochastic monotonicity.
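
Both the coupling and its consequence can be checked by enumeration for small n and b. The sketch below is illustrative (function names are not from the paper): it tests the row-by-row ordering of cumulative distribution functions and, separately, the claim that feeding the same digits to chains at i and i + 1 moves the carry by at most one.

```python
from fractions import Fraction
from itertools import product


def carries_matrix(n, b):
    """Carries transition matrix by enumeration of all digit columns."""
    P = [[Fraction(0)] * n for _ in range(n)]
    for i in range(n):
        for digits in product(range(b), repeat=n):
            P[i][(i + sum(digits)) // b] += Fraction(1, b ** n)
    return P


def is_stochastically_monotone(P):
    """Check P(i, {0..j}) >= P(i+1, {0..j}) for all i, j (adjacent rows suffice)."""
    size = len(P)
    for i in range(size - 1):
        cdf_low = cdf_high = 0
        for j in range(size):
            cdf_low += P[i][j]
            cdf_high += P[i + 1][j]
            if cdf_low < cdf_high:
                return False
    return True


def coupling_gap_is_zero_or_one(n, b):
    """With shared digits, the carry from i + 1 equals the carry from i, or exceeds it by 1."""
    for i in range(n - 1):
        for digits in product(range(b), repeat=n):
            k_low = (i + sum(digits)) // b
            k_high = (i + 1 + sum(digits)) // b
            if k_high - k_low not in (0, 1):
                return False
    return True
```

A matrix such as `[[0, 1], [1, 0]]` fails the first check, which shows the test is not vacuous.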

From Holte [24, Th. 4] and the fact that $n \ge 3$, the right eigenfunctions for eigenvalues
$1/b$ and $1/b^2$ can be taken as

$$f_1(i) = i - \frac{n-1}{2}, \qquad f_2(i) = i^2 - (n-1)i + \frac{(n-2)(3n-1)}{12}.$$
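
A quick exact check of these two eigenfunctions, on a carries matrix built by direct enumeration, might look as follows. This is a sketch with illustrative function names; it assumes the eigenvalues attached to $f_1$ and $f_2$ are $1/b$ and $1/b^2$ respectively.

```python
from fractions import Fraction
from itertools import product


def carries_matrix(n, b):
    """Carries transition matrix by enumeration of all digit columns."""
    P = [[Fraction(0)] * n for _ in range(n)]
    for i in range(n):
        for digits in product(range(b), repeat=n):
            P[i][(i + sum(digits)) // b] += Fraction(1, b ** n)
    return P


def apply(P, f):
    """(Pf)(i) = sum_j P(i, j) f(j)."""
    return [sum(P[i][j] * f[j] for j in range(len(f))) for i in range(len(P))]


def f1(n):
    return [Fraction(2 * i - (n - 1), 2) for i in range(n)]


def f2(n):
    return [Fraction(i * i - (n - 1) * i) + Fraction((n - 2) * (3 * n - 1), 12)
            for i in range(n)]


def is_eigenfunction(P, f, eigenvalue):
    return apply(P, f) == [eigenvalue * x for x in f]
```

For instance, `is_eigenfunction(carries_matrix(4, 3), f2(4), Fraction(1, 9))` tests the second eigenfunction for n = 4 and b = 3.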



The upper bound follows from stochastic monotonicity and the first eigenvector via jl^,
Th. 2.1]. For the lower bound, note that $f_1^2 = f_2 + A$, with $A = (n+1)/12$. This, and a simple
computation show that

ihix) -fiiy)f Fix, dy) 



I' 



This is the required input for the lower bound, using Th. 2.3]. One obtains that 



||P'"(z, •) — vtIItv > 1 — e for r < logf, 
when n > 3. 



^8{n+l)/12 



, and the result follows since 



8(n+l) 
12 



< n 
□ 



Remark. The argument for stochastic monotonicity does not depend on the assumption 
that the digits are uniform and independently distributed. Any joint distribution within a 
column (with columns independent) leads to a stochastically monotone Markov chain. In 
[15] it is shown that the transition matrix P is totally positive of order 2. This implies
stochastic monotonicity via [25, Prop. 1.3.1, p. 22].

To close this section, we prove that $\frac{1}{2}\log_b(n) + c$ steps are sufficient for total variation
convergence when n is large.

Theorem 3.4. With the notation of Theorem 3.1 there is a constant B > 0 (independent
of $n, b, c \ge 1$) such that for $r = \frac{1}{2}\log_b(nc)$,

$$\|P_0^r - \pi\|_{TV} \le \frac{B}{\sqrt{n}} + \frac{B}{\sqrt{c}}.$$

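Proof. From Theorem 4.3 of [15], there is $j^* \in \{0, 1, \ldots, n-1\}$ such that $P^r(0,j) \ge \pi(j)$
for $0 \le j \le j^*$, and that $P^r(0,j) \le \pi(j)$ for $j^*+1 \le j \le n-1$. Thus

$$\|P_0^r - \pi\|_{TV} = P_0^r\{0, 1, \ldots, j^*\} - \pi\{0, 1, \ldots, j^*\}. \tag{3.1}$$

From the proof of Theorem 2.5, $P^r(0,j) = P(j b^r \le \sum_{i=1}^n Y_i < (j+1) b^r)$ with $Y_i$ i.i.d.
uniform on $\{0, 1, \ldots, b^r-1\}$. From Lemma 2.4, $\pi(j) = P(j \le \sum_{i=1}^n U_i < j+1)$ with $U_i$ i.i.d.
uniform on [0, 1]. Thus

$$P_0^r(j) = P\left(\left\lfloor \sum_{i=1}^n \frac{Y_i}{b^r} \right\rfloor = j\right) \quad \text{and} \quad P_0^r\{0, 1, \ldots, j^*\} = P\left(\sum_{i=1}^n \frac{Y_i}{b^r} < j^*+1\right),$$

$$\pi(j) = P\left(\left\lfloor \sum_{i=1}^n U_i \right\rfloor = j\right) \quad \text{and} \quad \pi\{0, 1, \ldots, j^*\} = P\left(\sum_{i=1}^n U_i < j^*+1\right).$$

From the above considerations, we have

$$\|P_0^r - \pi\|_{TV} \le \sup_x \left| P\left(\sum_{i=1}^n \frac{Y_i}{b^r} < x\right) - P\left(\sum_{i=1}^n U_i < x\right) \right|. \tag{3.2}$$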
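Let $\mu_n = \frac{n}{2}$, $\sigma_n^2 = \frac{n}{12}$ and $\nu_n = \frac{n(b^r-1)}{2b^r}$, $\tau_n^2 = \frac{n(b^{2r}-1)}{12\, b^{2r}}$ (the means and variances of
$\sum_i U_i$ and $\sum_i Y_i/b^r$). The right-hand side of (3.2) is

$$\begin{aligned}
& \sup_y \left| P\left(\frac{\sum_i Y_i/b^r - \mu_n}{\sigma_n} \le y\right) - P\left(\frac{\sum_i U_i - \mu_n}{\sigma_n} \le y\right) \right| \\
&\le \sup_y \left| P\left(\frac{\sum_i Y_i/b^r - \mu_n}{\sigma_n} \le y\right) - \Phi(y) \right| + \sup_y \left| P\left(\frac{\sum_i U_i - \mu_n}{\sigma_n} \le y\right) - \Phi(y) \right| \\
&= I + II.
\end{aligned}$$

Here $\Phi(y) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{y} e^{-t^2/2}\, dt$ denotes the cumulative distribution function of the normal
distribution.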

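From the usual Berry-Esseen bound, $II \le B_1/\sqrt{n}$ with $B_1$ involving the second and third
moments of the uniform on [0, 1], uniformly bounded. Rewrite I as

$$\begin{aligned}
& \sup_y \left| P\left(\frac{\sum_i Y_i/b^r - \mu_n}{\sigma_n} \le y\right) - \Phi(y) \right| \\
&= \sup_z \left| P\left(\frac{\sum_i Y_i/b^r - \nu_n}{\tau_n} \le z\right) - \Phi(a_1 z + a_2) \right| \\
&\le \sup_z \left| P\left(\frac{\sum_i Y_i/b^r - \nu_n}{\tau_n} \le z\right) - \Phi(z) \right| + \sup_z \left| \Phi(z) - \Phi(a_1 z + a_2) \right|
\end{aligned}$$

with $a_1 = \tau_n/\sigma_n$, $a_2 = (\nu_n - \mu_n)/\sigma_n$. Using the Berry-Esseen bound again, the first term is
bounded above by $B_2/\sqrt{n}$ with $B_2$ involving the ratio

$$\frac{E\left| \frac{Y_1}{b^r} - \frac{b^r-1}{2b^r} \right|^3}{\left( E\left| \frac{Y_1}{b^r} - \frac{b^r-1}{2b^r} \right|^2 \right)^{3/2}}.$$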



This is uniformly bounded in b, n, c. To bound the final term, we use the following inequality:
for any $\mu \in \mathbb{R}$, $\sigma^2 \in \mathbb{R}_+$,

$$\sup_z |\Phi(z) - \Phi(\sigma z + \mu)| \le |\sigma^2 - 1| + \frac{\sqrt{2\pi}}{4} |\mu|. \tag{3.3}$$

An elegant proof of (3.3) using Stein's identity was communicated by Sourav Chatterjee.
Let W be Normal($\mu, \sigma^2$) and Z be Normal(0, 1). For any bounded f with a bounded,
piecewise continuous derivative, $E(W f(W)) = \mu E(f(W)) + \sigma^2 E(f'(W))$ (Stein's identity
being used). Thus

$$E(W f(W) - f'(W)) = \mu E(f(W)) + (\sigma^2 - 1) E(f'(W)).$$

As in [13, p. 22], choose $f_{w_0}$ so that for all w, one has

$$w f_{w_0}(w) - f'_{w_0}(w) = \delta_{w \le w_0} - \Phi(w_0). \tag{3.4}$$

Here $w_0$ is fixed. Stein shows that $|f_{w_0}(w)| \le \sqrt{2\pi}/4$ for all w, and that $|f'_{w_0}(w)| \le 1$ for all
w. Taking expectations in (3.4) proves (3.3). Taking $\sigma^2 = 1 - \frac{1}{b^{2r}}$, $\mu = -\frac{\sqrt{3n}}{b^r}$, it
follows that

$$\sup_z |\Phi(z) - \Phi(a_1 z + a_2)| \le B_3/\sqrt{c}$$

with $B_3$ independent of $n, b, c \ge 1$. □
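Remark. If W is Normal($\nu, \tau^2$) and Z is Normal($\mu, \sigma^2$), the bound (3.3) shows that the
Kolmogorov distance between their distributions is at most

$$\min\left\{ \frac{\sqrt{2\pi}}{4} \frac{|\nu - \mu|}{\sigma} + \left| \frac{\tau^2}{\sigma^2} - 1 \right|,\ \frac{\sqrt{2\pi}}{4} \frac{|\nu - \mu|}{\tau} + \left| \frac{\sigma^2}{\tau^2} - 1 \right| \right\}.$$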



4. Signed Permutations 

Let $B_n$, the hyperoctahedral group, be represented as signed permutations. Thus $B_n$
has $2^n n!$ elements. We associate elements of $B_n$ to arrangements of a deck of n cards with
cards allowed to be face up or face down. A natural analog of the Gilbert-Shannon-Reeds
shuffling model was studied in [3]; the deck is cut approximately in half, the top half turned
face up, and the two halves are riffled together according to the G-S-R prescription. These
shuffles have similarly neat combinatorial properties which allow sharp analysis of mixing
times. Of course, shuffling is a natural algebraic operation and type B shuffles have been
studied from an algebraic viewpoint (with applications to Hochschild homology) in 0],
[3], and [21]. This section develops a corresponding carries process in rough parallel with
Section 2. We also give an application to the theory of rounding.

From the previous sections, we see that the key idea is to use the fact that an a-shuffle
followed by a b-shuffle is equivalent to an ab-shuffle. A hyperoctahedral analog of (2a+1)-shuffles
was considered in @] (see also [21] for connections with the affine Weyl group). A
(2a+1)-shuffle is defined by multinomially cutting the deck into 2a + 1 piles, then flipping
over the even numbered piles, and riffling them together.

View $B_n$ as the signed permutations on n symbols, using the linear ordering

$$1 < 2 < \cdots < n < -n < \cdots < -2 < -1.$$

Say that

(1) $\sigma$ has a descent at position i ($1 \le i \le n-1$) if $\sigma(i) > \sigma(i+1)$.

(2) $\sigma$ has a descent at position n if $\sigma(n) < 0$.

For example, $-1\ {-2}\ {-3} \in B_3$ has three descents. Let $A(n,j)$ denote the number of
elements of $B_n$ with j descents. The Bergerons 0] give analogs of basic properties of riffle
shuffles. More precisely, they show that if a Markov chain on the hyperoctahedral group
begins at the identity and proceeds by successive independent (2b+1)-shuffles, then

• The chance of obtaining the signed permutation $\tau$ after r steps is

$$\frac{\binom{n + \frac{(2b+1)^r - 1}{2} - d(\tau^{-1})}{n}}{(2b+1)^{rn}}. \tag{4.1}$$

• A (2a+1)-shuffle followed by a (2b+1)-shuffle is equivalent to a (2a+1)(2b+1)-shuffle.



Using these gives a type B analog of Proposition 2.1. Gessel also has an unpublished proof
of Proposition 4.1 using P-partitions.
Proposition 4.1. Let $\sigma \in B_n$ have d descents. Let $c^d_{ij}$ be the number of ordered pairs $(\tau, \mu)$
of elements of $B_n$ such that $\tau$ has i descents, $\mu$ has j descents, and $\tau\mu = \sigma$. Then

$$\sum_{i,j \ge 0} \frac{c^d_{ij}\, s^i t^j}{(1-s)^{n+1}(1-t)^{n+1}} = \sum_{a,b \ge 0} \binom{n+2ab+a+b-d}{n} s^a t^b.$$

Proof. Since a (2a+1)-shuffle followed by a (2b+1)-shuffle is equivalent to a (2a+1)(2b+1)-shuffle,
the r = 1 case of (4.1) gives that

$$\left[\sum_{\mu \in B_n} \binom{n+a-d(\mu)}{n} \mu^{-1}\right] \left[\sum_{\tau \in B_n} \binom{n+b-d(\tau)}{n} \tau^{-1}\right] = \sum_{\sigma \in B_n} \binom{n+2ab+a+b-d(\sigma)}{n} \sigma^{-1}.$$

As in the proof of Proposition 2.1, one multiplies both sides by $s^a t^b$, sums over all $a, b \ge 0$,
and then takes the coefficient of $\sigma^{-1}$ on both sides to obtain the result. □

Next we define a "type B" carries process, to which we will relate the type B
hyperoctahedral shuffle. This is defined as the usual carries process, where one adds n length m
numbers base 2b + 1, and to these adds the length m number (b, b, ..., b). Note that the
state space of the type B carries chain is {0, 1, ..., n} (for usual carries, the most one can
carry is n - 1). For example when b = 1 (so 2b + 1 = 3), adding 222 and 201 followed by
appending 111 gives

2 1 1
  2 2 2
  2 0 1
  1 1 1

2 0 1 1

with carries $\kappa_0 = 0$, $\kappa_1 = 1$, $\kappa_2 = 1$, $\kappa_3 = 2$.
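
The worked example can be reproduced mechanically. In the sketch below (function name and argument conventions are illustrative), the extra summand (b, b, ..., b) is added digit by digit in each column, and the two base-3 numbers are parsed with Python's built-in `int(text, 3)`.

```python
def type_b_carries(numbers, b, width):
    """Carries kappa_0 = 0, ..., kappa_width of the type B process in base 2b + 1.

    Each column receives the incoming carry, the digits of the n summands,
    and one extra digit b coming from the appended number (b, b, ..., b).
    """
    base = 2 * b + 1
    rest = list(numbers)
    kappa = [0]
    for _ in range(width):
        column_total = kappa[-1] + b + sum(x % base for x in rest)
        kappa.append(column_total // base)
        rest = [x // base for x in rest]
    return kappa


# The example from the text: 222 and 201 in base 3, with 111 appended (b = 1).
example = type_b_carries([int("222", 3), int("201", 3)], b=1, width=3)
```

The list `example` holds the carries $\kappa_0, \ldots, \kappa_3$ for the sum displayed above.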

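Theorem 4.2. For $0 \le i, j \le n$,

(1) The transition probabilities of the type B carries chain are

$$P(i,j) = \frac{1}{(2b+1)^n} \sum_{l \ge 0} (-1)^l \binom{n+1}{l} \binom{n+(j-l)(2b+1)+b-i}{n}.$$

(2) The r-step transition probabilities of the type B carries chain are

$$P^r(i,j) = \frac{1}{(2b+1)^{rn}} \sum_{l \ge 0} (-1)^l \binom{n+1}{l} \binom{n+(j-l)(2b+1)^r+\frac{(2b+1)^r-1}{2}-i}{n}$$

(i.e. one replaces 2b + 1 by $(2b+1)^r$ in part 1). In both sums only the terms whose upper
binomial entry is at least n contribute.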
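
Both parts of the theorem can be checked exactly for small cases. The sketch below (illustrative names) evaluates the formula of part 1, keeping only the terms in which $(j-l)(2b+1)+b-i \ge 0$, compares it with direct enumeration, and tests part 2 in the form "two steps in base 3 equal one step in base 9".

```python
from fractions import Fraction
from itertools import product
from math import comb


def type_b_matrix_formula(n, b):
    """Type B carries matrix on {0, ..., n} from part 1 of the theorem."""
    base = 2 * b + 1
    P = [[Fraction(0)] * (n + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        for j in range(n + 1):
            total = 0
            for l in range(j + 1):
                e = (j - l) * base + b - i
                if e >= 0:  # terms with a negative exponent do not contribute
                    total += (-1) ** l * comb(n + 1, l) * comb(n + e, n)
            P[i][j] = Fraction(total, base ** n)
    return P


def type_b_matrix_brute(n, b):
    """Same matrix by enumerating all columns of n uniform digits in {0, ..., 2b}."""
    base = 2 * b + 1
    P = [[Fraction(0)] * (n + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        for digits in product(range(base), repeat=n):
            P[i][(i + b + sum(digits)) // base] += Fraction(1, base ** n)
    return P


def mat_mul(A, B):
    size = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]
```

Here base 9 corresponds to b = 4, since $((2 \cdot 1 + 1)^2 - 1)/2 = 4$.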

Proof. From the definition of the type B carries chain,

$$P(i,j) = P\left(j(2b+1) - b \le i + X_1 + \cdots + X_n \le j(2b+1) + b\right)$$

where $X_1, \ldots, X_n$ are independent and identically distributed discrete uniform random
variables in $\{0, 1, \ldots, 2b\}$. Equivalently,

$$P(i,j) = (2b+1) \cdot P\left(i + X_1 + \cdots + X_n + Y = j(2b+1) + b\right),$$

where $X_1, \ldots, X_n, Y$ are i.i.d. discrete uniforms in $\{0, 1, \ldots, 2b\}$. Letting $[x^n]f(x)$ denote
the coefficient of $x^n$ in a series $f(x)$, it follows that

$$\begin{aligned}
P(i,j) &= (2b+1)\, [x^{j(2b+1)+b-i}] \left( \frac{1 + x + \cdots + x^{2b}}{2b+1} \right)^{n+1} \\
&= \frac{1}{(2b+1)^n}\, [x^{j(2b+1)+b-i}]\, \frac{(1 - x^{2b+1})^{n+1}}{(1-x)^{n+1}} \\
&= \frac{1}{(2b+1)^n} \sum_{l \ge 0} (-1)^l \binom{n+1}{l} [x^{(j-l)(2b+1)+b-i}]\, \frac{1}{(1-x)^{n+1}} \\
&= \frac{1}{(2b+1)^n} \sum_{l \ge 0} (-1)^l \binom{n+1}{l} \binom{n+(j-l)(2b+1)+b-i}{n}.
\end{aligned}$$

Thus the first part is proved.

To prove the second half of the theorem, we show that r steps of the type B base-(2b+1) carries
chain is equivalent to one step of the type B base $(2b+1)^r$ carries chain. To compute the carry
after r steps of the type B carries chain, add $b\left(1 + (2b+1) + \cdots + (2b+1)^{r-1}\right)$ to the
sum of n length r numbers base 2b + 1. To compute the carry after one step of the type B
base $(2b+1)^r$ carries chain, add $\frac{(2b+1)^r - 1}{2}$ to the sum of n length 1 numbers base $(2b+1)^r$.
These computations are equivalent, so the result follows by replacing 2b + 1 by $(2b+1)^r$ in
part 1. □



Now we relate hyperoctahedral shuffles to type B carries. In what follows, $\pi$ denotes the
distribution on $\{0, 1, \ldots, n\}$ defined by $\pi(j) = \frac{A(n,j)}{2^n n!}$.


Theorem 4.3. Let a Markov chain on the hyperoctahedral group $B_n$ begin at the identity
and proceed by successive independent (2b+1)-shuffles. Then the number of descents of $\tau_r^{-1}$
forms a Markov chain, and its formal time reversal with respect to its stationary distribution
$\pi$ is identical with the carries Markov chain.

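Proof. Let $\tau_r$ be the element of $B_n$ obtained after r independent (2b+1)-shuffles (started at the
identity). Arguing as in the proof of Theorem 2.2 gives that $d(\tau_r^{-1})$ forms a Markov chain
with transition probabilities

$$P(d(\tau_r^{-1}) = j \mid d(\tau_{r-1}^{-1}) = i) = \frac{A(n,j)}{A(n,i)(2b+1)^n} \sum_{k \ge 0} c^j_{ik} \binom{n+b-k}{n}.$$

Here $c^j_{ik}$ is as in the statement of Proposition 4.1.

Letting $[x^n]f(x)$ denote the coefficient of $x^n$ in a series $f(x)$, the transition probability
in the previous paragraph can be written as

$$\frac{A(n,j)}{A(n,i)(2b+1)^n}\, [t^b] \sum_{k \ge 0} \frac{c^j_{ik}\, t^k}{(1-t)^{n+1}} = \frac{A(n,j)}{A(n,i)(2b+1)^n}\, [s^i t^b]\, (1-s)^{n+1} \sum_{m,k \ge 0} \frac{c^j_{mk}\, s^m t^k}{(1-s)^{n+1}(1-t)^{n+1}}.$$

By Proposition 4.1 this is equal to

$$\begin{aligned}
& \frac{A(n,j)}{A(n,i)(2b+1)^n}\, [s^i t^b]\, (1-s)^{n+1} \sum_{a,c \ge 0} \binom{n+2ac+a+c-j}{n} s^a t^c \\
&= \frac{A(n,j)}{A(n,i)(2b+1)^n}\, [s^i]\, (1-s)^{n+1} \sum_{a \ge 0} \binom{n+a(2b+1)+b-j}{n} s^a \\
&= \frac{A(n,j)}{A(n,i)(2b+1)^n} \sum_{l \ge 0} (-1)^l \binom{n+1}{l} \binom{n+(i-l)(2b+1)+b-j}{n}.
\end{aligned}$$

Comparing with Theorem 4.2, this is equal to $\pi(j)P(j,i)/\pi(i)$, as needed. □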

The next theorem is easily proved by the technique used to prove Theorem 2.3 (using
Proposition 4.1 instead of Proposition 2.1).

Theorem 4.4. The chance that the type B carries chain goes from 0 to j in r steps is equal
to the chance that an element of $B_n$ obtained by performing r successive (2b+1)-shuffles
(started at the identity) has j descents.



The following corollary is immediate from Theorem | 

Corollary 4.5. The stationary distribution of the type B carries chain is given by $\pi(j) =
\frac{A(n,j)}{2^n n!}$, where $A(n,j)$ is the number of signed permutations on n symbols with j descents.



Corollary 4.6 gives a closed formula for $A(n,j)$. (This can also be obtained by combining
Proposition 4.7 below with equation (19) of |l2l|).



Corollary 4.6.

$$A(n,j) = \sum_{l=0}^{j} (-1)^l \binom{n+1}{l} (2j - 2l + 1)^n.$$

Proof. Let r ^ oo in part 2 of Theorem 14.21 and apply Corollary 14.51 □ 
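
The closed formula of Corollary 4.6 can be checked by brute force for small n. The sketch below is only an illustration: the function names are hypothetical, and the descent convention (set w(0) = 0 and count positions i in {0, ..., n-1} with w(i) > w(i+1)) is an assumption, since several equivalent conventions for type B descents are in use.

```python
from itertools import permutations, product
from math import comb, factorial


def eulerian_b_formula(n, j):
    """A(n, j) from the alternating sum in Corollary 4.6."""
    return sum((-1) ** l * comb(n + 1, l) * (2 * j - 2 * l + 1) ** n
               for l in range(j + 1))


def eulerian_b_bruteforce(n):
    """Count signed permutations of {1, ..., n} by number of descents.

    Assumed convention: w(0) = 0, and i in {0, ..., n-1} is a descent
    when w(i) > w(i+1) in the usual order on the integers.
    """
    counts = [0] * (n + 1)
    for perm in permutations(range(1, n + 1)):
        for signs in product((1, -1), repeat=n):
            w = [0] + [s * p for s, p in zip(signs, perm)]
            descents = sum(w[i] > w[i + 1] for i in range(n))
            counts[descents] += 1
    return counts


for n in range(1, 5):
    by_formula = [eulerian_b_formula(n, j) for j in range(n + 1)]
    # The two counts agree, and they add up to |B_n| = 2^n n!.
    assert by_formula == eulerian_b_bruteforce(n)
    assert sum(by_formula) == 2 ** n * factorial(n)
    print(n, by_formula)
```

For n = 2 both computations give the counts 1, 6, 1, which sum to 8, the order of $B_2$.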

As an application of the above results, we give a new proof of the following lovely fact from
[30] (see also Section 9 of [13] for closely related results). Note that it can be interpreted
as computing the chance that the sum of n i.i.d. uniforms on [0, 1], when rounded to the
nearest integer, is equal to j.

Proposition 4.7. Let $U_1, \ldots, U_n$ be independent, identically distributed continuous uniform
random variables in [0, 1]. Then

\[
P\Big( j - \tfrac{1}{2} \le U_1 + \cdots + U_n \le j + \tfrac{1}{2} \Big) = \frac{A(n,j)}{2^n n!}.
\]
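
Proposition 4.7 can be verified exactly for small n. The sketch below assumes the standard closed form for the distribution function of a sum of n uniforms (the Irwin-Hall distribution), which is not taken from this paper, and uses exact rational arithmetic; all function names are hypothetical.

```python
from fractions import Fraction
from math import comb, factorial, floor


def irwin_hall_cdf(n, x):
    """Exact P(U_1 + ... + U_n <= x) for i.i.d. uniforms on [0, 1]."""
    x = Fraction(x)
    if x <= 0:
        return Fraction(0)
    if x >= n:
        return Fraction(1)
    total = sum((-1) ** k * comb(n, k) * (x - k) ** n
                for k in range(floor(x) + 1))
    return total / factorial(n)


def eulerian_b(n, j):
    """Type B Eulerian number A(n, j), via the formula of Corollary 4.6."""
    return sum((-1) ** l * comb(n + 1, l) * (2 * j - 2 * l + 1) ** n
               for l in range(j + 1))


def rounding_probability(n, j):
    """P(j - 1/2 <= U_1 + ... + U_n <= j + 1/2), computed exactly."""
    half = Fraction(1, 2)
    return irwin_hall_cdf(n, j + half) - irwin_hall_cdf(n, j - half)


for n in range(1, 7):
    for j in range(n + 1):
        expected = Fraction(eulerian_b(n, j), 2 ** n * factorial(n))
        assert rounding_probability(n, j) == expected
```

For n = 3 and j = 1, for instance, both sides equal 23/48.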

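Proof. Let $X_1, \ldots, X_n$ be i.i.d. discrete uniforms on $\{0, 1, \ldots, 2b\}$. From the definition of
the type B, base-$(2b+1)$ carries chain,

\[
P(0,j) = P\Big( j(2b+1) - b \le \sum_{i=1}^{n} X_i < j(2b+1) + b + 1 \Big).
\]

Let $Y_1, \ldots, Y_n$ be i.i.d. continuous uniforms on $[0, 2b+1]$. Then it follows that

\[
P(0,j) = P\Big( j(2b+1) - b \le \sum_{i=1}^{n} \lfloor Y_i \rfloor < j(2b+1) + b + 1 \Big)
\]
\[
= P\Big( (2b+1)\big(j - \tfrac{1}{2}\big) \le \sum_{i=1}^{n} Y_i - \sum_{i=1}^{n} (Y_i - \lfloor Y_i \rfloor) - \tfrac{1}{2} < (2b+1)\big(j + \tfrac{1}{2}\big) \Big)
\]
\[
= P\Big( j - \tfrac{1}{2} \le \sum_{i=1}^{n} U_i - E < j + \tfrac{1}{2} \Big).
\]

Here the $U_i = Y_i/(2b+1)$ are i.i.d. continuous uniforms on [0, 1], and

\[
E = \frac{1}{2b+1} \Big( \frac{1}{2} + \sum_{i=1}^{n} (Y_i - \lfloor Y_i \rfloor) \Big).
\]

Note that when n is fixed and $b \to \infty$, E converges to 0 with probability 1. Thus Slutsky's
theorem implies that

\[
\lim_{b \to \infty} P(0,j) = P\Big( j - \tfrac{1}{2} \le \sum_{i=1}^{n} U_i < j + \tfrac{1}{2} \Big).
\]

However, by Theorem 4.2 and Corollary 4.5,

\[
\lim_{b \to \infty} P(0,j) = \pi(j) = \frac{A(n,j)}{2^n n!},
\]

which completes the proof. □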
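
The limit used in this proof can be observed numerically. The following sketch computes the first row P(0, j) of the type B chain directly from the digit-sum description at the start of the proof and compares it with $A(n,j)/(2^n n!)$ as b grows. The function names are hypothetical, and the example sizes are arbitrary choices.

```python
from fractions import Fraction
from math import comb, factorial


def digit_sum_counts(n, base):
    """counts[s] = number of n-tuples of digits in {0..base-1} with sum s."""
    counts = [1]
    for _ in range(n):
        new = [0] * (len(counts) + base - 1)
        for s, c in enumerate(counts):
            for d in range(base):
                new[s + d] += c
        counts = new
    return counts


def first_row(n, b):
    """P(0, j), j = 0..n, for the type B carries chain in base 2b + 1."""
    base = 2 * b + 1
    counts = digit_sum_counts(n, base)
    row = []
    for j in range(n + 1):
        lo = max(j * base - b, 0)
        hi = min(j * base + b + 1, len(counts))
        row.append(Fraction(sum(counts[lo:hi]), base ** n))
    return row


def stationary(n):
    """pi(j) = A(n, j) / (2^n n!), with A(n, j) as in Corollary 4.6."""
    def eulerian_b(j):
        return sum((-1) ** l * comb(n + 1, l) * (2 * j - 2 * l + 1) ** n
                   for l in range(j + 1))
    return [Fraction(eulerian_b(j), 2 ** n * factorial(n)) for j in range(n + 1)]


n = 3
for b in (1, 10, 100):
    gap = max(abs(p - q) for p, q in zip(first_row(n, b), stationary(n)))
    print(b, float(gap))
```

The printed gap shrinks as b increases, in line with the statement that E tends to 0.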

5. Two Final Topics 

The carries matrix also comes up in studying sections of generating functions via the 
Veronese map. The large n limit of the carries process is well approximated by a classical 
auto-regressive process. 

5.1. Eulerian polynomials and Hilbert series of Veronese subrings. Some natural
sequences $a_k$, $0 \le k < \infty$, have generating functions

\[
\sum_{k=0}^{\infty} a_k x^k = \frac{h(x)}{(1-x)^{n+1}}
\]

with $h(x) = h_0 + h_1 x + \cdots + h_{n+1} x^{n+1}$ a polynomial of degree at most n + 1. Suppose we
are interested in every bth term $\{a_{bk}\}$, $0 \le k < \infty$. It is not hard to see that

\[
\sum_{k=0}^{\infty} a_{bk} x^k = \frac{h^{\langle b \rangle}(x)}{(1-x)^{n+1}}
\]

for another polynomial $h^{\langle b \rangle}(x)$ of degree at most n + 1. The study of these generating
functions arises naturally in algebraic geometry [19] and lattice point enumeration.
Brenti and Welker [§] show that the ith coefficient of $h^{\langle b \rangle}(x)$ satisfies

\[
h_i^{\langle b \rangle} = \sum_{j=0}^{n+1} C(i,j)\, h_j
\]

with C an (n + 2) × (n + 2) matrix with (i, j) entry ($0 \le i, j \le n+1$) equal to the number of
solutions to $a_1 + \cdots + a_{n+1} = ib - j$ where $0 \le a_l \le b - 1$ are integers. In [15] we show that
the n × n matrix given by deleting the first and last rows and columns of C, then multiplying
by $b^{-n}$ and taking the transpose is precisely the carries matrix (P(i, j)) of (1.1).
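
The relation between C and the carries matrix can be checked by direct counting. In the sketch below the carries matrix is built from its definition (carry i in, n random base-b digits added, carry j out) rather than from any formula in the text; the function names are hypothetical.

```python
from fractions import Fraction


def digit_sum_counts(m, b):
    """counts[s] = number of m-tuples of digits in {0..b-1} with sum s."""
    counts = [1]
    for _ in range(m):
        new = [0] * (len(counts) + b - 1)
        for s, c in enumerate(counts):
            for d in range(b):
                new[s + d] += c
        counts = new
    return counts


def brenti_welker_matrix(n, b):
    """C(i, j) = number of solutions of a_1 + ... + a_{n+1} = i*b - j."""
    counts = digit_sum_counts(n + 1, b)
    return [[counts[i * b - j] if 0 <= i * b - j < len(counts) else 0
             for j in range(n + 2)]
            for i in range(n + 2)]


def carries_matrix(n, b):
    """Carries chain by definition: carry i in, n digits added, carry j out."""
    counts = digit_sum_counts(n, b)
    P = [[Fraction(0)] * n for _ in range(n)]
    for i in range(n):
        for s, c in enumerate(counts):
            P[i][(i + s) // b] += Fraction(c, b ** n)
    return P


def trimmed_scaled_transpose(n, b):
    """Delete first/last rows and columns of C, scale by b^-n, transpose."""
    C = brenti_welker_matrix(n, b)
    return [[Fraction(C[j + 1][i + 1], b ** n) for j in range(n)]
            for i in range(n)]


for n, b in [(2, 2), (3, 3), (4, 10)]:
    assert trimmed_scaled_transpose(n, b) == carries_matrix(n, b)
```

For n = 2 and b = 2 both sides are the matrix with rows (3/4, 1/4) and (1/4, 3/4).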

Since iterates of the carries chain converge, the matrix C(i, j) has nice limiting behavior.
Brenti and Welker show that the zeros of $h^{\langle b \rangle}$ converge and Beck-Stapledon
show that the zeros converge to the zeros of the nth Eulerian polynomial $p_n(x)$, defined as
$\sum_{j \ge 0} A(n,j) x^{j+1}$, where $A(n,j)$ is the number of permutations in $S_n$ with j descents. The
following is a refinement.

Theorem 5.1. Suppose that $h(1) \ne 0$ and let $p_n(x)$ be the nth Eulerian polynomial. Then
as $b \to \infty$ with n fixed,

\[
\frac{h^{\langle b \rangle}(x)}{b^n \cdot h(1)} \to \frac{p_n(x)}{n!}.
\]



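Proof. Let $[y^k]f(y)$ denote the coefficient of $y^k$ in a power series $f(y)$. Then the definition
of C(i, j) gives that

\[
C(i,j) = [y^{ib-j}] \frac{(1-y^b)^{n+1}}{(1-y)^{n+1}}
       = \sum_{l \ge 0} (-1)^l \binom{n+1}{l} [y^{(i-l)b-j}] \frac{1}{(1-y)^{n+1}}
       = \sum_{l \ge 0} (-1)^l \binom{n+1}{l} \binom{n + (i-l)b - j}{n}.
\]

Supposing that $1 \le i \le n$, it follows that

\[
(5.1) \qquad \lim_{b \to \infty} \frac{C(i,j)}{b^n}
       = \frac{1}{n!} \sum_{l=0}^{i} (-1)^l \binom{n+1}{l} (i-l)^n = \frac{A(n,i-1)}{n!},
\]

where the second equality uses a well-known formula for Eulerian numbers [13(]. Since
$C(0,j) = \delta_{0,j}$ and $C(n+1,j) = \delta_{n+1,j}$, clearly

\[
(5.2) \qquad \lim_{b \to \infty} \frac{C(0,j)}{b^n} = \lim_{b \to \infty} \frac{C(n+1,j)}{b^n} = 0.
\]

Combining equations (5.1) and (5.2) yields that

\[
\lim_{b \to \infty} \frac{h^{\langle b \rangle}(x)}{b^n \cdot h(1)}
  = \lim_{b \to \infty} \frac{\sum_{i=0}^{n+1} x^i \sum_{j=0}^{n+1} C(i,j) h_j}{b^n \cdot h(1)}
  = \frac{\sum_{i=1}^{n} x^i A(n,i-1) \sum_{j=0}^{n+1} h_j}{n! \cdot h(1)}
  = \frac{p_n(x)}{n!}.
\]

□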



Here is an example. The coordinate ring R of a projective variety in n + 1 variables
decomposes into its graded pieces $R_k$, $0 \le k < \infty$, and the Hilbert series has the form [1,
Th. 11.1]

\[
\sum_{k=0}^{\infty} \dim(R_k)\, x^k = \frac{h(x)}{(1-x)^{n+1}}.
\]

The bth Veronese embedding replaces the variables by all degree b monomials in these
variables. (If b = 3, {x, y} are replaced by $x^3, x^2y, xy^2, y^3$.) The image of the coordinate
ring has Hilbert series $h^{\langle b \rangle}(x)/(1-x)^{n+1}$. As a simple special case, the full projective space
has coordinate ring $\mathbb{C}[x_1, \ldots, x_{n+1}]$. The degree k homogeneous polynomials have dimension
$\binom{n+k}{n}$ and

\[
(5.3) \qquad \sum_{k=0}^{\infty} \binom{n+k}{n} x^k = \frac{1}{(1-x)^{n+1}}.
\]

When n + 1 = 2,

\[
\sum_{k=0}^{\infty} (k+1) x^k = \frac{1}{(1-x)^2}
\quad \text{and} \quad
\sum_{k=0}^{\infty} (bk+1) x^k = \frac{(b-1)x + 1}{(1-x)^2}.
\]

When n + 1 = 3,

\[
\sum_{k=0}^{\infty} \binom{k+2}{2} x^k = \frac{1}{(1-x)^3}
\quad \text{and} \quad
\sum_{k=0}^{\infty} \binom{bk+2}{2} x^k = \frac{x^2 (b(b-3)+2) + x (b(b+3)-4) + 2}{2(1-x)^3}.
\]

Dividing the right-hand sides of these expressions by $b^n \cdot h(1)$ (here $b$ and $b^2$ respectively),
then multiplying by $(1-x)^{n+1}$ and letting $b \to \infty$, gives $p_n(x)/n!$ (here $x$ and $(x^2+x)/2$
respectively).
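
The two examples can be reproduced mechanically. The sketch below extracts every bth term of a sequence and multiplies the resulting series by $(1-x)^{n+1}$ to recover the numerator $h^{\langle b \rangle}(x)$; it assumes, as in the text, that this numerator has degree at most n + 1, so n + 2 terms of the series suffice. The function name is hypothetical.

```python
from fractions import Fraction
from math import comb


def veronese_numerator(a, n, b):
    """Coefficients of h^<b>(x), where sum_k a(b*k) x^k = h^<b>(x)/(1-x)^(n+1).

    `a` is a function k -> a_k. Only the first n + 2 terms of the section
    are needed because the numerator has degree at most n + 1.
    """
    section = [a(b * k) for k in range(n + 2)]
    # Multiply the truncated series by (1 - x)^(n+1).
    return [sum((-1) ** l * comb(n + 1, l) * section[i - l]
                for l in range(i + 1))
            for i in range(n + 2)]


# n + 1 = 2: a_k = k + 1, numerator (b - 1) x + 1.
b = 5
assert veronese_numerator(lambda k: k + 1, 1, b) == [1, b - 1, 0]

# n + 1 = 3: a_k = C(k + 2, 2), numerator (x^2 (b(b-3)+2) + x (b(b+3)-4) + 2) / 2.
b = 4
assert veronese_numerator(lambda k: comb(k + 2, 2), 2, b) == [
    1, (b * (b + 3) - 4) // 2, (b * (b - 3) + 2) // 2, 0]

# Limit in Theorem 5.1 for n = 2 (here h(1) = 1): coefficients approach (0, 1/2, 1/2, 0).
b = 1000
scaled = [Fraction(c, b ** 2) for c in veronese_numerator(lambda k: comb(k + 2, 2), 2, b)]
print([float(c) for c in scaled])
```

With b = 1000 the scaled coefficients of x and $x^2$ are both within 0.01 of 1/2, matching $p_2(x)/2! = (x^2 + x)/2$.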

5.2. Autoregressive approximation. This section studies the large n limit of the carries
process and shows it is well approximated by a classical autoregressive process. Throughout,
we work with a general base b and let n be the number of summands. Let $\kappa_0 = 0, \kappa_1, \kappa_2, \ldots$
be the carries process on $\{0, 1, \ldots, n-1\}$. Let $Y_t = (\kappa_t - n/2)/\sqrt{n/12}$, $t = 0, 1, 2, \ldots$.
Theorem 5.2 relates $Y_t$ to a Gaussian autoregressive process $W_0, W_1, W_2, \ldots$ defined by
$W_0 = -\sqrt{3n}$, $W_{t+1} = W_t/b + \epsilon_t$, with the $\epsilon_t$ independent Normal$(0, 1 - 1/b^2)$ random variables.

Theorem 5.2. Let $P_n$ be the law of the process $Y_t$, $0 \le t < \infty$, on $\mathbb{R}^\infty$. Let Q be the law
of the process $W_0, W_1, \ldots$ on $\mathbb{R}^\infty$. Then $P_n \Rightarrow Q$ as $n \to \infty$.
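
As a small sanity check on the normalization, one can compare the exact mean of $Y_t$ for the carries chain with the mean $-\sqrt{3n}/b^t$ of $W_t$. The sketch below propagates the exact distribution of $\kappa_t$ from $\kappa_0 = 0$ using the definition of carrying; the function names and the sizes n = 50, b = 10 are arbitrary choices for illustration.

```python
from math import sqrt


def digit_sum_probs(n, b):
    """probs[s] = P(sum of n uniform base-b digits equals s)."""
    probs = [1.0]
    for _ in range(n):
        new = [0.0] * (len(probs) + b - 1)
        for s, p in enumerate(probs):
            for d in range(b):
                new[s + d] += p / b
        probs = new
    return probs


def carries_step(dist, sum_probs, b):
    """One step of the carries chain: carry i, digit sum s -> carry (i+s)//b."""
    new = [0.0] * len(dist)
    for i, p in enumerate(dist):
        if p == 0.0:
            continue
        for s, q in enumerate(sum_probs):
            new[(i + s) // b] += p * q
    return new


def standardized_mean(n, b, t):
    """E[Y_t] for the carries chain started at kappa_0 = 0."""
    sum_probs = digit_sum_probs(n, b)
    dist = [1.0] + [0.0] * (n - 1)
    for _ in range(t):
        dist = carries_step(dist, sum_probs, b)
    mean_kappa = sum(k * p for k, p in enumerate(dist))
    return (mean_kappa - n / 2) / sqrt(n / 12)


def ar_mean(n, b, t):
    """E[W_t] for W_0 = -sqrt(3n), W_{t+1} = W_t / b + eps_t, E[eps_t] = 0."""
    return -sqrt(3 * n) / b ** t


n, b = 50, 10
for t in range(4):
    print(t, standardized_mean(n, b, t), ar_mean(n, b, t))
```

The two means agree at t = 0 and afterwards differ by a term of order $n^{-1/2}$, which vanishes in the limit considered in Theorem 5.2.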

The following lemma will be helpful for proving Theorem 5.2.

Lemma 5.3. The base b carries process can be represented as follows:

(5.4) If $\kappa_t \equiv r \bmod b$, let $\kappa_{t+1} = \frac{\kappa_t - r}{b} + \epsilon_{t+1}$,

where $P(\epsilon_{t+1} = k)$ is the chance that the sum of n + 1 independent discrete uniforms on
$\{0, 1, \ldots, b-1\}$ is equal to $bk + b - r - 1$, given that the sum is congruent to $b - r - 1$ mod b.



Proof. From page 140 of Holte [2j] one can write the carries transition probability as

(5.5) $P(i,j) = \frac{1}{b^n} [x^{(j+1)b - i - 1}] (1 + x + \cdots + x^{b-1})^{n+1}$

where $[x^k]f(x)$ denotes the coefficient of $x^k$ in a polynomial $f(x)$. If $i \equiv r \bmod b$, write
$j = \frac{i-r}{b} + \epsilon_{t+1}$. Then (5.5) becomes

\[
P(i,j) = \frac{1}{b^n} [x^{b\epsilon_{t+1} + b - r - 1}] (1 + x + \cdots + x^{b-1})^{n+1}.
\]

To see that this implies the lemma, note that the sum of n + 1 discrete uniforms on
$\{0, 1, \ldots, b-1\}$ is equidistributed mod b, and so is congruent to $b - r - 1$ mod b with
probability $1/b$. □
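
The representation in Lemma 5.3 can be compared with the carries chain computed directly from its definition. In this sketch the conditional probability in the lemma is evaluated by counting (n + 1)-tuples of digits; the function names are hypothetical.

```python
from fractions import Fraction


def digit_sum_counts(m, b):
    """counts[s] = number of m-tuples of digits in {0..b-1} with sum s."""
    counts = [1]
    for _ in range(m):
        new = [0] * (len(counts) + b - 1)
        for s, c in enumerate(counts):
            for d in range(b):
                new[s + d] += c
        counts = new
    return counts


def lemma_transition(n, b, i, j):
    """P(kappa_{t+1} = j | kappa_t = i) according to Lemma 5.3."""
    r = i % b
    k = j - (i - r) // b              # required value of eps_{t+1}
    target = b * k + b - r - 1
    counts = digit_sum_counts(n + 1, b)
    if not 0 <= target < len(counts):
        return Fraction(0)
    # Condition on the residue class of the sum of n + 1 digits.
    same_residue = sum(counts[target % b::b])
    return Fraction(counts[target], same_residue)


def direct_transition(n, b, i, j):
    """Same probability from the definition: carry i in, n digits, carry j out."""
    counts = digit_sum_counts(n, b)
    hits = sum(c for s, c in enumerate(counts) if (i + s) // b == j)
    return Fraction(hits, b ** n)


for n, b in [(4, 3), (6, 2)]:
    for i in range(n):
        for j in range(n):
            assert lemma_transition(n, b, i, j) == direct_transition(n, b, i, j)
```

The equidistribution mod b used at the end of the proof shows up here as `same_residue` always being equal to $b^n$.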

Proof. (Of Theorem 5.2) We show convergence by showing that $\{P_n\}_{n=1}^{\infty}$ is tight and that
the finite dimensional distributions of $P_n$ converge to the finite dimensional distributions
of Q. This is enough from [13, 2.2, 4.3, 4.5]. From [10, 2.4], $P_n$ is tight if and only if the
family $P_n^h$ of hth marginal distributions is tight. Thus it is enough to show that the finite
dimensional distributions converge.

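By Lemma 5.3 the carries process can be represented as:

(5.6) If $\kappa_t \equiv r \bmod b$, let $\kappa_{t+1} = \frac{\kappa_t - r}{b} + \epsilon_{t+1}$,

where $P(\epsilon_{t+1} = k)$ is the chance that the sum of n + 1 i.i.d. uniforms on $\{0, 1, \ldots, b-1\}$ is
equal to $bk + b - r - 1$, given that the sum is congruent to $b - r - 1$ mod b. Hence $b\epsilon_{t+1}$ (for
$\kappa_t \equiv b - 1 \bmod b$) has the distribution of the sum of n + 1 uniforms on $\{0, 1, \ldots, b-1\}$ given
that the sum is congruent to 0 mod b. A generating function argument then shows that if
$n \ge 2$ then $\epsilon_{t+1}$ has mean $\frac{n+1}{2}\left(1 - \frac{1}{b}\right)$ and variance $\frac{n+1}{12}\left(1 - \frac{1}{b^2}\right)$. By the local central limit
theorem for sums of i.i.d. random variables (replacing n + 1 by n does not affect the limit),

\[
\frac{\epsilon_{t+1} - \frac{n}{2}\left(1 - \frac{1}{b}\right)}{\sqrt{\frac{n}{12}\left(1 - \frac{1}{b^2}\right)}} \Rightarrow N(0,1)
\]

as $n \to \infty$, and a similar argument gives the same conclusion for $\kappa_t$ congruent to any r mod
b, with error term $O(n^{-1/2})$ since b is fixed.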
From these considerations, the joint distribution of $(\epsilon_1, \epsilon_2, \ldots, \epsilon_h)$, normalized as above,
converges to the product of h independent standard normal variables (h fixed, n large).

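Next, represent $\kappa_{t+1} = \frac{\kappa_t - \delta_t}{b} + \epsilon_{t+1}$ with $\delta_t = \kappa_t \bmod b$. Thus, for $t = 1, 2, 3, \ldots, h-1$,
with $\kappa_0 = \delta_0 = 0$,

\[
\kappa_{t+1} = \Big( \epsilon_{t+1} + \frac{\epsilon_t}{b} + \cdots + \frac{\epsilon_1}{b^t} \Big)
             - \Big( \frac{\delta_t}{b} + \frac{\delta_{t-1}}{b^2} + \cdots + \frac{\delta_0}{b^{t+1}} \Big).
\]

Since $\kappa_{t+1} = \sqrt{n/12} \cdot Y_{t+1} + \frac{n}{2}$ and $\kappa_0 = \sqrt{n/12} \cdot Y_0 + \frac{n}{2}$ it follows that

\[
Y_{t+1} = \frac{Y_0}{b^{t+1}} + \frac{n/2}{\sqrt{n/12}} \Big( \frac{1}{b^{t+1}} - 1 \Big)
        + \frac{1}{\sqrt{n/12}} \Big( \epsilon_{t+1} + \frac{\epsilon_t}{b} + \cdots + \frac{\epsilon_1}{b^t} \Big)
        - \frac{1}{\sqrt{n/12}} \Big( \frac{\delta_t}{b} + \frac{\delta_{t-1}}{b^2} + \cdots + \frac{\delta_0}{b^{t+1}} \Big)
\]
\[
= \frac{Y_0}{b^{t+1}}
  + \sum_{s=0}^{t} \frac{1}{b^s} \cdot \frac{\epsilon_{t+1-s} - \frac{n}{2}\left(1 - \frac{1}{b}\right)}{\sqrt{n/12}}
  - \frac{1}{\sqrt{n/12}} \Big( \frac{\delta_t}{b} + \cdots + \frac{\delta_0}{b^{t+1}} \Big).
\]

As noted earlier, the $\frac{\epsilon_i - \frac{n}{2}\left(1 - \frac{1}{b}\right)}{\sqrt{n/12}}$ converge to independent $N(0, 1 - 1/b^2)$'s. Since $\delta_i < b$ for
all i, the term involving the $\delta$'s converges to 0 with probability 1, so by Slutsky's theorem,
it can be disregarded. We thus have that the joint distribution of $(Y_0, Y_1, \ldots, Y_h)$ converges
to the joint distribution of $(W_0, W_1, \ldots, W_h)$, and the proof is complete. □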

Remark. The Gaussian autoregressive process $X_{n+1} = \frac{1}{b} X_n + \epsilon_{n+1}$ (with $X_0 = x$) is carefully
studied in [13]. It has eigenvalues $1, 1/b, 1/b^2, \ldots$ and takes order $\log_b |x| + c$ steps to converge
[171 . Prop. 4.9]. Taking $x = -\sqrt{3n}$ as in Theorem 5.2, this is consistent with our result in
Section 3 that $\frac{1}{2} \log_b(n) + c$ steps is the right answer for the carries chain.

Remark. Theorem 5.2 implies that many properties of Gaussian autoregressive processes
(here the discrete time Ornstein-Uhlenbeck process) apply to carries, at least in the limit.
For example Corollary 2 of Lai [l^] implies, in the notation above, that



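\[
P(W_t \ge b_t \text{ i.o.}) = 1 \text{ or } 0 \text{ according as } \sum_{t=0}^{\infty} \frac{1}{b_t} e^{-b_t^2/2} = \infty \text{ or } < \infty.
\]

It follows for carries that

\[
P\Big( \kappa_t > \frac{n}{2} + \sqrt{\frac{n}{12}}\, (\log(t))^{\frac{1}{2} + \epsilon} \text{ i.o.} \Big) = 1 \text{ or } 0 \text{ according as } \epsilon \le 0 \text{ or } \epsilon > 0.
\]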